1. Introduction


During the past fifteen years considerable effort has been devoted to 
the problem of estimating unknown parameters in distributed parameter 
systems. The recent book by Banks and Kunisch [9] provides an excellent
account of the progress made in the field. Many parameter estimation 
problems are best formulated as optimization problems (often over infinite 
dimensional "parameter spaces") and algorithms are developed to minimize an 
appropriate cost function. Although there are several approaches to these 
problems, their infinite dimensional nature requires that numerical 
approximations be introduced at some point in the analysis. Consequently, 
there are two basic classes of algorithms for optimization based parameter 
estimation. The first type of algorithm, and the most frequently used for 
dynamic problems, is indirect and proceeds by initially approximating the 
dynamic equations (e.g. finite elements, finite differences, etc.) and then 
using optimization algorithms on the finite dimensional problem. This 
approach is typified by the papers [1] - [6], [8], [10], and [17]. The
second, more direct approach applies an (perhaps infinite dimensional)
optimization algorithm directly and employs numerical approximations at
each step of the algorithm to compute the necessary solutions of the
dynamic equations. This approach is used in [12], [13], and [18]. Both
methods have advantages and disadvantages. Depending on the particular
type of distributed parameter system, one method may outperform the other.

Direct methods, such as the quasilinearization method considered here, are often
limited by the fact that the dependence on unknown parameters of the 
solution to the infinite dimensional dynamical equations may not be "smooth 
enough" to establish convergence of the algorithm. Indeed, some algorithms 
may not be properly defined without this necessary smoothness. Indirect 
methods avoid this difficulty and often lead to easily implemented 
algorithms. On the other hand, when direct methods can be applied it is 
sometimes possible to establish the convergence and the rates of 
convergence to the unknown optimal parameters (see [13], [18]). 

This paper considers the dependence on an unknown parameter q of the
solution of the linear abstract Cauchy problem

(1.1)    $\dot{x}(t) = A(q)x(t) + u(t), \quad 0 < t < T,$
         $x(0) = x_0.$

Our ultimate goal is to formulate and establish the convergence of a 
gradient-based parameter estimation algorithm applicable in this abstract 
setting. 

This algorithm employs computation of the gradient $D_q x(t;q)$ of the
solution of (1.1) with respect to the parameter. Conditions for the
existence of this gradient are established in [11]. In Section 2 we review
these conditions and the general setting for the remainder of the paper.
Convergence of the algorithm requires certain smoothness properties of the
gradient $D_q x(t;q)$ with respect to q. These properties are established in
Section 3 and their applicability to a linear delay-differential equation
is discussed in Section 4. In this example the delay is among the
parameters so that in this setting the parameter dependence appears in
unbounded terms of the evolution operator A(q).

An abstract parameter estimation algorithm is presented in Section 5,
where its convergence is established using the results of Section 3.
In Section 6 we present several numerical examples which indicate the
performance of the algorithm for delay and coefficient estimation in linear
delay-differential equations. Additional examples may be found in [12].
Numerical testing and evaluation on a wider variety of parameter estimation
problems will be undertaken in a subsequent paper.

2. The General Setting


The application of quasilinearization to parameter estimation requires
knowledge of the derivative of the state with respect to the unknown
parameter. This topic is addressed in [11]. In this section we review the
framework used there to obtain differentiability and establish notation to
be used in the remainder of this paper.

Let P be an open subset of a normed linear space $\mathcal{P}$ with norm $|\cdot|$ and
let X be a Banach space with norm $\|\cdot\|$. For every $q \in P$ let A(q) be a
linear operator on D(A(q)) in X. Throughout this paper we assume

(H1) A(q) generates a strongly continuous semigroup S(t;q) on X;

(H2) D(A(q)) = D is independent of q;

(H3) $\|S(t;q)x\| \le M e^{\omega t}\|x\|$, $x \in X$, $t \ge 0$, $q \in P$, for some constants
     M and $\omega$ independent of q, x, and t.

Fix T > 0 and $u \in L^1(0,T;X)$. Define

         $Q(t;q) = \int_0^t S(t-s;q)u(s)\,ds$  for $q \in P$, $0 \le t \le T$.

Note that if (1.1) has a strong solution then it is given by
the formula $x(t) = S(t;q)x_0 + Q(t;q)$ for $0 \le t \le T$.
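
As a finite-dimensional illustration of this formula (not part of the original development), one may take X = R and the hypothetical generator A(q) = -q, so that S(t;q) = e^{-qt}. The sketch below evaluates S(t;q)x_0 + Q(t;q) with a trapezoidal rule for the convolution and compares it with the closed-form solution for the constant input u = 1. The function name `mild_solution` and all numerical values are assumptions made for this example only.

```python
import math

def mild_solution(q, x0, u, t, n=2000):
    """Evaluate x(t) = S(t;q) x0 + int_0^t S(t-s;q) u(s) ds for scalar A(q) = -q."""
    S = lambda tau: math.exp(-q * tau)      # semigroup generated by A(q) = -q
    h = t / n
    # composite trapezoidal rule for the convolution term Q(t;q)
    Q = 0.5 * (S(t) * u(0.0) + S(0.0) * u(t))
    for i in range(1, n):
        s = i * h
        Q += S(t - s) * u(s)
    Q *= h
    return S(t) * x0 + Q

q, x0, t = 2.0, 1.0, 1.5                    # illustrative values
approx = mild_solution(q, x0, lambda s: 1.0, t)
# closed form for u = 1: x(t) = e^{-qt} x0 + (1 - e^{-qt}) / q
exact = math.exp(-q * t) * x0 + (1.0 - math.exp(-q * t)) / q
```

In this scalar setting the two values agree up to the quadrature error; the infinite-dimensional case differs only in that S(t;q) is an operator semigroup rather than an exponential.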

In applications of this theory it is useful to consider just those 
terms of A(q) in which the parameter appears. To this end we write 
A(q) = A + B(q) where A and B(q) both have domain D and A is independent 
of q. Concerning B(q) we assume the following: 

(H4) For every $q, q_0 \in P$ there is a constant K such that

         $\int_0^T \|B(q)S(t;q_0)x\|\,dt \le K\|x\|$  for all $x \in D$.

In Section 4 we discuss an example in which an unbounded operator B(q)
satisfies (H4). This hypothesis does imply, however, that the linear
mapping $x \mapsto B(q)S(\cdot\,;q_0)x$ is bounded as a mapping from D into $L^1(0,T;X)$.
Let $F(q,q_0)$ denote the bounded linear extension of this operator to X. Let
$\|\cdot\|_1$ denote the norm in $L^1(0,T;X)$. Concerning F we assume the following:

(H5) There is a closed subspace Y of X such that

     (i) $F(q,q_0)x_0 \in L^1(0,T;Y)$ for $q, q_0 \in P$, and

     (ii) for every $q_0 \in P$ and $\epsilon > 0$ there exists $\delta > 0$ such that
          $\|F(q,q_0)y - F(q_0,q_0)y\|_1 \le \epsilon\|y\|$ for $y \in Y$ and
          $|q - q_0| \le \delta$.

The analogue of F for the function Q(t;q) is the mapping $G(q,q_0)$ from
$L^1(0,T;D)$ into $L^1(0,T;X)$ defined by

         $[G(q,q_0)w](t) = \int_0^t B(q)S(t-s;q_0)w(s)\,ds.$

By (H4) it follows that G can be extended to a bounded linear mapping on
$L^1(0,T;X)$ so that in particular $G(q,q_0)u$ is defined as an element of
$L^1(0,T;X)$. In addition we assume

(H6) $G(q,q_0)u \in L^1(0,T;Y)$ for $q, q_0 \in P$

where Y denotes the subspace required by (H5).

3. Parameter Dependence 

In this section we deduce smoothness properties of the solution
$x(t;q) = S(t;q)x_0 + Q(t;q)$ with respect to q. These properties are derived
from similar properties of $F(q,q_0)$ and $G(q,q_0)$, which are operators related
to A(q). These results will be used in Section 5 to prove convergence of
the parameter estimation algorithm. Throughout this section T > 0, $x_0 \in X$,
and $u \in L^1(0,T;X)$ are fixed as given in (1.1). The symbol $D_q$ denotes
Frechet differentiation with respect to q. These results are given as a
series of lemmas whose proofs are at the end of this section.

Lemma 3.1. Suppose (H1) - (H5) hold. In addition, suppose that for a
given $q^* \in P$

(H7) $F(q,q_0)x_0$ is Frechet differentiable with respect to q at $q_0$
     for every $q_0 \in P$.

For brevity, let $DF(q_0)$ denote $D_q[F(q,q_0)x_0]$ for $q_0 \in P$. In addition,
suppose

(H8) DF(q) is strongly continuous in q at $q^*$, that is, for each
     $h \in \mathcal{P}$ the mapping $q \mapsto DF(q)h$ from P into $L^1(0,T;X)$ is
     continuous at $q^*$.

Then for each $t \in [0,T]$, $S(t;q)x_0$ is Frechet differentiable with respect to q
at every $q \in P$ and $D_q[S(t;q)x_0]$ is strongly continuous with respect to q
at $q^*$.

Lemma 3.2. Suppose (H1) - (H6) hold and in addition suppose that for a
given $q^* \in P$,

(H9) $G(q,q_0)u$ is Frechet differentiable with respect to q at $q_0$
     for every $q_0 \in P$.

Again denoting this derivative by $DG(q_0)$ for $q_0 \in P$, assume

(H10) DG(q) is strongly continuous in q at $q^*$.

Then for $t \in [0,T]$, Q(t;q) is Frechet differentiable with respect to q at
every $q \in P$ and $D_q[Q(t;q)]$ is strongly continuous in q at $q^*$.

Lemma 3.3. Suppose (H1) - (H5) and (H7) hold and in addition suppose

(H11) $F(q,q^*)$ is locally Lipschitz continuous in q at $q^*$, uniformly
      for $y \in Y$, that is, there exist constants $K_1, \delta_1 > 0$ such that

         $\|F(q,q^*)y - F(q^*,q^*)y\|_1 \le K_1 |q - q^*|\,\|y\|$

      whenever $|q - q^*| < \delta_1$ and $y \in Y$.

Moreover, assume that

(H12) DF(q) is strongly locally Lipschitz continuous with respect
      to q at $q^*$. That is, for each $h \in \mathcal{P}$, there are constants
      $K, \delta > 0$ such that

         $\|DF(q)h - DF(q^*)h\|_1 \le K|q - q^*|$

      for $|q - q^*| < \delta$.

Then $D_q[S(t;q)x_0]$ is strongly locally Lipschitz continuous with respect to
q at $q^*$ for every $t \in [0,T]$.

Lemma 3.4. Suppose (H1) - (H6), (H9) - (H10) hold and in addition suppose

(H13) DG(q) is strongly locally Lipschitz continuous with
      respect to q at $q^*$.

Then $D_q[Q(t;q)]$ is strongly locally Lipschitz continuous with respect to q
at $q^*$ for every $t \in [0,T]$.

Although the assumptions (H1) - (H13) are rather technical, we shall
see that they can be easily verified for delay systems even in the case
that the unknown parameter is the delay itself. Therefore, the results
presented here remove the limitations placed on the perturbation B(q) in
papers [13] and [16].

For completeness we now present the proofs of Lemma 3.1 - Lemma 3.4.
However, these proofs make use of the basic results found in [11] and in
order to keep the length of the proofs reasonable we assume that the reader
has [11] in hand.

Proof of Lemma 3.1. It is shown in [11] that (H1) - (H5), (H7) imply that
$D_q[S(t;q)x_0]$ exists for $q \in P$. Furthermore, it is given by the formula

(3.1)    $D_q[S(t;q)x_0]h = \int_0^t S(t-s;q)[DF(q)h](s)\,ds, \quad h \in \mathcal{P}.$

We therefore obtain by substitution

(3.2)    $D_q[S(t;q)x_0]h - D_q[S(t;q^*)x_0]h$
         $\quad = \int_0^t [S(t-s;q) - S(t-s;q^*)]([DF(q)h](s))\,ds$
         $\qquad + \int_0^t S(t-s;q^*)([DF(q)h](s) - [DF(q^*)h](s))\,ds.$

Let $\epsilon > 0$ be given and let $C = Me^{\omega T}$. It can be shown (see the proof of
Theorem 1 [11]) that for all $x \in X$

(3.3)    $\|S(t;q)x - S(t;q^*)x\| \le C\|F(q,q^*)x - F(q^*,q^*)x\|_1.$

Combining (3.3) with (H5ii) shows that for some $\delta_1 > 0$

         $\|S(t;q)y - S(t;q^*)y\| \le \epsilon C\|y\|, \quad 0 \le t \le T, \quad y \in Y,$

whenever $|q - q^*| < \delta_1$. In particular, putting $y = [DF(q)h](s) \in Y$ by
(H5i) we obtain

         $\|[S(t-s;q) - S(t-s;q^*)][DF(q)h](s)\| \le \epsilon C\|[DF(q)h](s)\|$

for $|q - q^*| < \delta_1$, a.e. $s \in (0,T)$. Since DF(q)h is continuous at $q^*$, there
exist constants $K_2, \delta_2 > 0$ such that

         $\|DF(q)h\|_1 \le K_2$  for $|q - q^*| < \delta_2$.

Combining these estimates shows that the first term in (3.2) is bounded
by $\epsilon C K_2$ if $|q - q^*| < \min(\delta_1, \delta_2)$.

Using (H8) it is easy to see that there exists $\delta_3 > 0$ such that the
second term in (3.2) is bounded by $\epsilon C$ for $|q - q^*| < \delta_3$. These estimates
complete the proof of Lemma 3.1.
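
A scalar special case may help fix the meaning of formula (3.1). The following worked example is an illustration added here under simplifying assumptions (X = R, A = 0, B(q) = -q with q > 0, so that B(q) is bounded); it is not taken from [11].

```latex
\begin{align*}
S(t;q)x_0 &= e^{-qt}x_0, \qquad
[F(q,q_0)x_0](s) = B(q)S(s;q_0)x_0 = -q\,e^{-q_0 s}x_0, \\
[DF(q)h](s) &= -h\,e^{-qs}x_0, \\
\int_0^t S(t-s;q)\,[DF(q)h](s)\,ds
  &= -h\,x_0 \int_0^t e^{-q(t-s)}\,e^{-qs}\,ds
   = -h\,t\,e^{-qt}x_0
   = \Big(\frac{d}{dq}\,e^{-qt}x_0\Big)h .
\end{align*}
```

In this case the right-hand side of (3.1) reproduces the ordinary derivative of $e^{-qt}x_0$ with respect to q, as expected.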

Proof of Lemma 3.2. By Theorem 3 of [11], $D_q[Q(t;q)]$ exists for $q \in P$ and

(3.4)    $D_q[Q(t;q)] - D_q[Q(t;q^*)]$
         $\quad = \int_0^t [S(t-s;q) - S(t-s;q^*)][(DG(q))(s)]\,ds$
         $\qquad + \int_0^t S(t-s;q^*)[(DG(q))(s) - (DG(q^*))(s)]\,ds$

where u has been suppressed in the notation. Since $DG(q) \in L^1(0,T;Y)$ for
$q \in P$ by (H6), the proof follows exactly as in the proof of Lemma 3.1.

Proof of Lemma 3.3. Let $\epsilon > 0$ be given. By (3.3) and (H11) there exists
$\delta_1 > 0$ such that

         $\|S(t;q)y - S(t;q^*)y\| \le C K_1 \|y\|\,|q - q^*|$

for $y \in Y$ and $|q - q^*| < \delta_1$. Since $DF(q)h \in L^1(0,T;Y)$ by (H5i) we have as
in the proof of Lemma 3.1 that the first term of (3.2) is bounded by
$C K_1 K_2 |q - q^*|$ for $|q - q^*| < \min(\delta_1, \delta_2)$. An estimate of the same form is
easily obtained for the second term of (3.2) using (H12). These estimates
complete the proof of Lemma 3.3.

Proof of Lemma 3.4. Since $DG(q)u \in L^1(0,T;Y)$ by (H6), the proof follows
exactly as in the proof of Lemma 3.3 using (3.4) in place of (3.2).

4. Application to a Delay-Differential Equation 

In this section we apply the framework of the previous sections to the 
linear delay-differential equation 


(4.1)    $\dot{x}(t) = a_0 x(t) + \sum_{k=1}^{n} a_k x(t - q_k) + u(t),$
         $x(0) = \eta,$
         $x_0 = \varphi.$

Let $\mathcal{P} = \mathbb{R}^n$, fix r > 0, and let $P = \{q = (q_1, q_2, \ldots, q_n) : 0 < q_k < r$
for $k = 1, 2, \ldots, n\}$. In equation (4.1), $\eta \in \mathbb{R}$, $a_k \in \mathbb{R}$, $k = 0, 1, \ldots, n$,
$\varphi \in L^1(-r,0)$ with norm denoted by $\|\varphi\|_1$, $u \in L^1(0,T)$, and $x_t(s) = x(t+s)$
for $t \ge 0$, $-r \le s \le 0$. By a solution of (4.1) we mean a function x which
is absolutely continuous on [0,T] and satisfies (4.1) almost everywhere on
(0,T).

Following the construction in [14], we take $X = \mathbb{R} \times L^1(-r,0)$ with norm
$\|(\eta,\varphi)\| = |\eta| + \|\varphi\|_1$ and define for $q \in P$ an operator A(q) on

         $D = \{(\eta,\varphi) \in X : \varphi$ is abs. cont. on $[-r,0]$, $\dot{\varphi} \in L^1(-r,0)$, and $\varphi(0) = \eta\}$

by

         $A(q)(\eta,\varphi) = \left( a_0 \varphi(0) + \sum_{k=1}^{n} a_k \varphi(-q_k),\ \dot{\varphi} \right).$

It is well known that A(q) generates a strongly continuous semigroup
S(t;q) on X satisfying $S(t;q)(\eta,\varphi) = (y(t), y_t)$ where y(t) = y(t;q) denotes the
solution of (4.1) with u = 0. It is a consequence of standard results that
(H1) - (H3) hold in this setting.

For $q = (q_1, \ldots, q_n)$ and $q_0$ in P, $(\eta,\varphi) \in X$, and $w \in L^1(0,T)$ it
follows that in this example the mappings F and G of Section 2 are given by

(4.2)    $F(q,q_0)(\eta,\varphi) = \left( \sum_{k=1}^{n} a_k\, y(t - q_k; q_0),\ 0 \right)$

and

(4.3)    $[G(q,q_0)w](t) = \left( \sum_{k=1}^{n} a_k\, z(t - q_k; q_0),\ 0 \right)$

for a.e. $t \in (0,T)$ where z(t;q) denotes the solution of (4.1) with u = w
and $(\eta,\varphi) = (0,0)$. It is shown in [11] that these mappings satisfy
(H4) - (H6) with the closed subspace $Y = \mathbb{R} \times \{0\}$. It is also shown in [11]
that F and G satisfy the differentiability hypotheses (H7) and (H9) for
$(\eta,\varphi) = x_0 \in D$ and $q, q_0 \in P$. Furthermore, their Frechet derivatives are
given by

(4.4)    $[DF(q)h](t) = \left( -\sum_{k=1}^{n} a_k\, \dot{y}(t - q_k; q)\, h_k,\ 0 \right)$

and

(4.5)    $[DG(q)h](t) = \left( -\sum_{k=1}^{n} a_k\, \dot{z}(t - q_k; q)\, h_k,\ 0 \right)$

for $q \in P$, $h = (h_1, \ldots, h_n) \in \mathbb{R}^n$, where y(t;q) is the solution of (4.1)
with u = 0 and z(t;q) is the solution of (4.1) with $(\eta,\varphi) = (0,0)$.

It remains to establish conditions under which (H8), (H10) - (H13) are
satisfied.

Lemma 4.1. Fix $q^* = (q_1^*, \ldots, q_n^*) \in P$ and $x_0 \in D$. Then $F(q,q^*)x_0$ as
defined by (4.2) satisfies (H11).

Proof: In Section 5 of [11] it is shown that there is a constant $C_2$ such
that

         $\|F(q,q^*)(\eta,0) - F(q^*,q^*)(\eta,0)\|_1 \le C_2 |h|\,\|(\eta,0)\|$

for $q \in P$, $h = q - q^* \in \mathbb{R}^n$, $\eta \in \mathbb{R}$. Here we define $|h| = \sum_{k=1}^{n} |h_k|$. This estimate is
equivalent to (H11) with $Y = \mathbb{R} \times \{0\}$.

Lemma 4.2. Suppose $x_0 = (\eta,\varphi) \in D$. Then DF(q) as given by (4.4) satisfies
(H8). Moreover, if in addition $\dot{\varphi}$ is of bounded variation on [-r,0], then
DF(q) satisfies (H12).

Proof: Let $A_m = \max_k |a_k|$ and $|h| = \max_k |h_k|$. Then we obtain the estimate

(4.6)    $\|DF(q)h - DF(q^*)h\|_1 \le A_m |h| \sum_{k=1}^{n} \int_0^T |\dot{y}(t - q_k; q) - \dot{y}(t - q_k; q^*)|\,dt$
         $\qquad + A_m |h| \sum_{k=1}^{n} \int_0^T |\dot{y}(t - q_k; q^*) - \dot{y}(t - q_k^*; q^*)|\,dt.$

Now from (4.1) we obtain

(4.7)    $\int_0^T |\dot{y}(t - q_k; q) - \dot{y}(t - q_k; q^*)|\,dt \le \int_0^T |\dot{y}(t;q) - \dot{y}(t;q^*)|\,dt$
         $\quad \le A_m \sum_{j=1}^{n} \int_0^T |y(t - q_j; q) - y(t - q_j^*; q^*)|\,dt$
         $\quad \le A_m \sum_{j=1}^{n} \int_0^T |y(t - q_j; q) - y(t - q_j; q^*)|\,dt$
         $\qquad + A_m \sum_{j=1}^{n} \int_0^T |y(t - q_j; q^*) - y(t - q_j^*; q^*)|\,dt$
         $\quad \le A_m \sum_{j=1}^{n} \int_0^T |y(t;q) - y(t;q^*)|\,dt$
         $\qquad + A_m \sum_{j=1}^{n} \int_0^T |y(t - q_j; q^*) - y(t - q_j^*; q^*)|\,dt.$

Now since $y(t;q) = S(t;q)x_0$ is differentiable with respect to q it is not
difficult to show that there are constants $\beta$ and $\delta$ such that

(4.8)    $\int_0^T |y(t;q) - y(t;q^*)|\,dt \le \beta |q - q^*|$

whenever $|q - q^*| < \delta$. Combining (4.7) and (4.8) with (4.6) yields

(4.9)    $\|DF(q)h - DF(q^*)h\|_1 \le A_m^2 |h|\, n \beta\, |q - q^*|$
         $\qquad + A_m^2 |h|\, n \sum_{k=1}^{n} \int_0^T |y(t - q_k; q^*) - y(t - q_k^*; q^*)|\,dt$
         $\qquad + A_m |h| \sum_{k=1}^{n} \int_0^T |\dot{y}(t - q_k; q^*) - \dot{y}(t - q_k^*; q^*)|\,dt.$

Since $(\eta,\varphi) \in D$, we have y and $\dot{y}$ in $L^1(-r,T)$. Therefore, the integral
terms in (4.9) approach zero as $q \to q^*$ and (H8) holds. If $\dot{\varphi}$ is of bounded
variation on [-r,0], then y and $\dot{y}$ are of bounded variation on [-r,T]. By
[15, Theorem 2.1.7(b)] this implies that the integral terms in (4.9) are
$O(|q - q^*|)$ as $q \to q^*$ so that (H12) holds.

Lemma 4.3. Suppose $u \in L^1(0,T)$. Then DG(q) as defined by (4.5) satisfies
(H10). Moreover, if in addition u is of bounded variation on [0,T], then
DG(q) satisfies (H13).

Proof: Using (4.5) in place of (4.4) one obtains the estimate (4.9) above
with y replaced by z. Now if $u \in L^1(0,T)$ then z and $\dot{z}$ are in $L^1(-r,T)$ so
that (H10) holds. Similarly, if u is of bounded variation on [0,T], then z
and $\dot{z}$ are of bounded variation on [-r,T] so that (H13) is satisfied.

5. The Algorithm 

In this section we define a parameter estimation algorithm based on
quasilinearization and establish local convergence using the results of
Section 3. Here we assume that the parameter space $\mathcal{P}$ is $\mathbb{R}^n$ with canonical
basis $e_i$, $i = 1, 2, \ldots, n$.

Given $x_0 \in D$ and $q \in P \subset \mathbb{R}^n$ a strong solution of (1.1) is given by
$S(t;q)x_0 + Q(t;q)$. Here we have used the notation of Section 2. Let C be
a bounded linear mapping from X into $\mathbb{R}^{\ell}$ and define

         $\gamma(t;q) = C[S(t;q)x_0 + Q(t;q)].$

The parameter estimation algorithm is related to the following optimization
problem.

Problem 5.1. Let $y_j \in \mathbb{R}^{\ell}$, $j = 1, 2, \ldots, m$ be data values taken at
times $t_j \in [0,T]$, $j = 1, 2, \ldots, m$, respectively. For $q \in P$ define the
quadratic cost function

         $J(q) = \sum_{j=1}^{m} |\gamma(t_j;q) - y_j|^2.$

Find $q^* \in P$ such that $J(q^*) \le J(q)$ for all $q \in P$.

The quasilinearization method defines a recursive algorithm whose
fixed point is a local solution of Problem 5.1. A more complete
exposition is given in [7]. Given an initial guess $q_0 \in P$ define

         $q_{k+1} = f(q_k), \quad k = 0, 1, 2, \ldots$

where

         $f(q) = q - [D(q)]^{-1} b(q),$
         $D(q) = \sum_{j=1}^{m} M^T(t_j;q)\, M(t_j;q),$
         $b(q) = \sum_{j=1}^{m} M^T(t_j;q)\,[\gamma(t_j;q) - y_j],$

and the matrix M(t;q) has its i-th column $M^i(t;q)$ given by

         $M^i(t;q) = C D_q[S(t;q)x_0 + Q(t;q)]e_i, \quad i = 1, 2, 3, \ldots, n.$
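
The iteration q_{k+1} = f(q_k) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the implementation used for the computations in this paper: the scalar output model `gamma(t, q) = q[0] * exp(-q[1] * t)` is a hypothetical stand-in for C[S(t;q)x_0 + Q(t;q)], its sensitivities are written analytically, and the helper names `solve`, `sens`, and `quasilinearization` are invented for the sketch.

```python
import math

def solve(Dm, b):
    """Solve Dm x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(Dm)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x

def gamma(t, q):
    """Hypothetical scalar output model standing in for C[S(t;q)x0 + Q(t;q)]."""
    return q[0] * math.exp(-q[1] * t)

def sens(t, q):
    """Row M(t;q): partial derivatives of gamma with respect to q[0], q[1]."""
    e = math.exp(-q[1] * t)
    return [e, -q[0] * t * e]

def quasilinearization(q, times, data, iters=20):
    """Iterate q <- q - D(q)^{-1} b(q) with D = sum M^T M, b = sum M^T (gamma - y)."""
    n = len(q)
    for _ in range(iters):
        D = [[0.0] * n for _ in range(n)]
        b = [0.0] * n
        for t, y in zip(times, data):
            M = sens(t, q)
            r = gamma(t, q) - y
            for i in range(n):
                b[i] += M[i] * r
                for j in range(n):
                    D[i][j] += M[i] * M[j]
        step = solve(D, b)
        q = [qi - si for qi, si in zip(q, step)]
    return q

times = [0.0, 0.5, 1.0, 1.5, 2.0]
q_true = [2.0, 1.0]                               # illustrative "true values"
data = [gamma(t, q_true) for t in times]          # exact fit to data, J(q*) = 0
q_est = quasilinearization([1.5, 0.8], times, data)
```

With exact data the iterates settle on `q_true`, which is the situation covered by the exact-fit convergence result below. In the delay setting the rows of M would instead come from numerically integrated sensitivity equations.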

Lemma 5.1. Suppose the hypotheses of Lemmas 3.1 and 3.2 are satisfied.
Then $M(t_j;q)$ is continuous in q at $q^*$.

Proof. This is a direct consequence of Lemmas 3.1 and 3.2 and the above
definitions.

Lemma 5.2. Suppose the hypotheses of Lemmas 3.3 and 3.4 are satisfied.
Then there exist constants $\alpha, \delta > 0$ such that

         $|M(t_j;q) - M(t_j;q^*)| \le \alpha |q - q^*|$

for $|q - q^*| < \delta$, $j = 1, 2, \ldots, m$.

Proof. This is a direct consequence of Lemmas 3.3 and 3.4 and the above
definitions.

We can now prove the following convergence results. These results are
typical of quasilinearization methods and the proofs given here are in the
same spirit as those in [7]. We obtain superlinear convergence when there
is an exact fit to data (Theorem 5.1) and linear convergence in the
presence of error (Theorem 5.2).

Theorem 5.1. Suppose the hypotheses of Lemmas 3.1 and 3.2 are satisfied.
Moreover, assume $[D(q^*)]^{-1}$ exists, $f(q^*) = q^*$, and $J(q^*) = 0$. Then for
every $\epsilon > 0$ there exists $\delta > 0$ such that

         $|f(q) - f(q^*)| \le \epsilon |q - q^*|$

for $|q - q^*| < \delta$. In particular, there is a neighborhood U of $q^*$ such that
$q_k \to q^*$ as $k \to \infty$ whenever $q_0 \in U$.

Proof. Note that $f(q^*) = q^*$ implies that $b(q^*) = 0$, or

(5.1)    $\sum_{j=1}^{m} M^T(t_j;q^*)[\gamma(t_j;q^*) - y_j] = 0.$

Therefore

         $f(q) - f(q^*) = D(q)^{-1}[D(q)(q - q^*) - b(q)]$
         $\quad = D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[M(t_j;q)(q - q^*) - (\gamma(t_j;q) - y_j)]$
         $\quad = D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[M(t_j;q) - M(t_j;q^*)](q - q^*)$
         $\qquad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[\gamma(t_j;q) - \gamma(t_j;q^*) - M(t_j;q^*)(q - q^*)]$
         $\qquad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[\gamma(t_j;q^*) - y_j].$

Therefore, using (5.1) we have that

(5.2)    $f(q) - f(q^*) = D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[M(t_j;q) - M(t_j;q^*)](q - q^*)$
         $\qquad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)[\gamma(t_j;q) - \gamma(t_j;q^*) - M(t_j;q^*)(q - q^*)]$
         $\qquad - D(q)^{-1} \sum_{j=1}^{m} [M^T(t_j;q) - M^T(t_j;q^*)][\gamma(t_j;q^*) - y_j].$

Note that $D(q)^{-1}$ exists and is bounded in a neighborhood of $q^*$ since
$D(q^*)^{-1}$ exists by assumption and $D(q)^{-1}$ is continuous at $q^*$ by Lemma 5.1.

Let $\epsilon > 0$ be given. Using Lemma 5.1 it is easy to see that there
exist constants $\beta_1, \delta_1 > 0$ such that the first term in (5.2) is bounded by
$\epsilon \beta_1 |q - q^*|$ for $|q - q^*| < \delta_1$. Furthermore, since $M(t_j;q^*)$ is the Frechet
derivative of $\gamma(t_j;q)$ at $q^*$, one can show that there exist constants
$\beta_2, \delta_2 > 0$ such that the second term of (5.2) is bounded by $\epsilon \beta_2 |q - q^*|$ for
$|q - q^*| < \delta_2$. Combining these estimates with (5.2) yields

(5.3)    $|f(q) - f(q^*)| \le \epsilon \beta |q - q^*| + |D(q)^{-1}| \sum_{j=1}^{m} |M^T(t_j;q) - M^T(t_j;q^*)|\,|\gamma(t_j;q^*) - y_j|,$

for $|q - q^*| < \delta = \min(\delta_1, \delta_2)$ and $\beta = \beta_1 + \beta_2$. Since $J(q^*) = 0$, the last
term in (5.3) is zero. This estimate yields the desired result.

The following theorem does not require an exact fit to data, but does
place some technical restrictions on the behaviour of M near $q^*$. Note
that if Lemmas 3.3 and 3.4 hold then there exists $\bar{\delta} > 0$ such that for
$0 < \delta < \bar{\delta}$ there exists a constant $K(\delta)$ such that

         $\sum_{j=1}^{m} |M^T(t_j;q) - M^T(t_j;q^*)| \le K(\delta)|q - q^*|$

for $|q - q^*| < \delta$. Let $K^* = \limsup_{\delta \downarrow 0} K(\delta)$ and define

(5.4)    $\lambda^* = K^* |D(q^*)^{-1}| \max_j |\gamma(t_j;q^*) - y_j|.$

Theorem 5.2. Suppose the hypotheses of Lemmas 3.3 and 3.4 are satisfied.
Moreover, assume $[D(q^*)]^{-1}$ exists and $f(q^*) = q^*$. Let $\lambda^*$ be defined by
(5.4) and assume $\lambda^* < 1$. Then there exists $\delta^* > 0$ such that

         $|f(q) - f(q^*)| \le \lambda^* |q - q^*|$

for $|q - q^*| < \delta^*$. In particular, $q_k \to q^*$ as $k \to \infty$ whenever
$|q_0 - q^*| < \delta^*$.

Proof. This estimate is a direct consequence of (5.3).

6. Numerical Examples 

In this section we consider several examples in which the above 
algorithm was used to solve parameter estimation problems in delay- 
differential equations. In these examples the emphasis is on delay 
identification since in the abstract setting this represents an unbounded 
perturbation of the generator as noted in Section 4. 

With the exception of Example 6.8, the various unknown parameters are
estimated using data generated from closed-form expressions for the
solution found by the "method of steps". The algorithm is implemented by
an averaging scheme [2] which approximates the state equation and the
associated sensitivity equations by a system of ordinary differential
equations. This system is solved by a fourth-order Runge-Kutta routine.
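
To make the idea of replacing the delay equation by a system of ordinary differential equations concrete, the sketch below uses a standard first-order averaging approximation for the one-delay equation with coefficients a and b and delay r: z[0] approximates x(t) and z[j] approximates the mean of x over the j-th subinterval of the delay window. This is an assumed form chosen for illustration; the scheme of [2] may differ in detail, the sensitivity equations are omitted, and the names `averaging_rhs`, `rk4_step`, and `simulate` are hypothetical.

```python
import math

def averaging_rhs(z, a, b, r, N):
    """Right-hand side of the assumed averaging approximation.
    z[0] ~ x(t); z[j] ~ mean of x over [t - j r/N, t - (j-1) r/N], j = 1..N."""
    dz = [a * z[0] + b * z[N]]                 # delayed term replaced by last average
    for j in range(1, N + 1):
        dz.append((N / r) * (z[j - 1] - z[j])) # transport of the history
    return dz

def rk4_step(z, h, f):
    """One classical fourth-order Runge-Kutta step for z' = f(z)."""
    k1 = f(z)
    k2 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
    k3 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
    k4 = f([zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h / 6.0 * (p + 2 * q + 2 * s + w)
            for zi, p, q, s, w in zip(z, k1, k2, k3, k4)]

def simulate(a, b, r, N, phi, t_end, h=0.001):
    """Integrate the averaged system from t = 0 to t_end; return the approximation of x(t_end)."""
    z = [phi(0.0)]
    for j in range(1, N + 1):
        left, right = -j * r / N, -(j - 1) * r / N
        z.append(0.5 * (phi(left) + phi(right)))   # exact interval mean for linear phi
    f = lambda state: averaging_rhs(state, a, b, r, N)
    for _ in range(int(round(t_end / h))):
        z = rk4_step(z, h, f)
    return z[0]

# Equation of Example 6.1 with r = 1 and sixteen subintervals, evaluated at t = 0.5
x_half = simulate(-2.0, 3.0, 1.0, 16, lambda s: s + 1.0, 0.5)
# method-of-steps solution on [0,1]: x(t) = 1.5 t - 0.75 + 1.75 exp(-2 t)
exact_half = 1.5 * 0.5 - 0.75 + 1.75 * math.exp(-1.0)
```

Because the approximation is only first order in 1/N, a visible discrepancy between `x_half` and `exact_half` is to be expected with sixteen subintervals.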

In the one delay examples the averaging scheme is implemented with the
delay interval [-r, 0] divided into sixteen equal segments, except that
Example 6.8 uses 64 equal segments. In the two delay examples the
intervals $[-r_2, -r_1]$ and $[-r_1, 0]$ are divided into sixteen equal segments.
All computations were done on a VAX 11/750 minicomputer or a SUN
Microsystem at the Institute for Computer Applications in Science and
Engineering (ICASE).

Example 6.1. This example illustrates the rapid convergence of the method
for a single unknown parameter (the delay in the following equation) with
an initial guess which is an order of magnitude greater than the "true
value" of r = 1.0. The equation and the results of the iteration are given
below.

         $\dot{x}(t) = -2x(t) + 3x(t - r), \quad t > 0,$
         $x(t) = t + 1, \quad t \le 0.$

         iterate        r        error
            0        10.000     34.056
            1         1.299      0.955
            2         0.946      0.175
            3         0.989      0.115
            4         0.987      0.115



The convergence of the states to ten data points on the interval [0,2] is 
illustrated in Figure 1. 
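
For reference, the first step of the method of steps for the equation of Example 6.1 with the "true value" r = 1 can be carried out by hand. The derivation below is added here as an illustration and is not reproduced from the source; it uses only the equation and initial function stated in the example.

```latex
\begin{align*}
\text{For } 0 \le t \le 1:\quad x(t-1) &= (t-1) + 1 = t, \\
\dot{x}(t) &= -2x(t) + 3t, \qquad x(0) = 1, \\
x(t) &= \tfrac{3}{2}\,t - \tfrac{3}{4} + \tfrac{7}{4}\,e^{-2t}.
\end{align*}
```

Substituting this expression for x(t-1) on the next interval [1,2] gives another linear ordinary differential equation with known forcing, and so on interval by interval.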

Example 6.2. The data is the same as for Example 6.1; however, in this case
the algorithm is asked to estimate the coefficients as well as the delay.
The equation shows an insensitivity to the individual coefficients which
leads to the inaccuracy in the converged estimates. In fact, because of
errors introduced by the averaging scheme for computing the state, the
estimated values fit the data better than the "true values" used to compute
the data by the method of steps. The "true values" are a = -2, b = 3, and
r = 1. The equation and the results of the iteration are given below:

         $\dot{x}(t) = a x(t) + b x(t - r), \quad t > 0,$
         $x(t) = t + 1, \quad t \le 0.$

         iterate        a         b         r        error
            0        -4.000     7.000     2.000      3.379
            1        -0.815     3.537     1.184      2.968
            2        -1.596     3.342     1.122      0.775
            3        -2.403     3.713     1.002      0.188
            4        -2.250     3.361     1.015      0.094
            5        -2.352     3.483     1.006      0.093



The convergence of the states is illustrated in Figure 2. 

Example 6.3 . This case illustrates the effect of a forcing function on the 
state equation. The nonhomogeneous delay-differential equation 


         $\dot{x}(t) = a x(t) + b x(t - r) + u(t), \quad t > 0,$
         $x(t) = t + 1, \quad t \le 0,$

where

         $u(t) = 0$ for $t < 0.1$,  $u(t) = 1$ for $t \ge 0.1$,

is solved in closed form by the method of steps with parameter values
a = -2, b = 3, r = 1 as in Example 6.2. The results of the parameter
estimation algorithm are given below:

         iterate        a         b         r        error
            0        -4.000     7.000     2.000      4.0527
            1         1.022     3.165     1.140     39.2657
            2        -2.637    23.652     1.168     24.9577
            3        -5.979    28.631     1.141     11.6964
            4        -8.034    23.250     1.118      3.5425
            5        -5.167     5.417     1.028      2.0471
            6        -1.239     4.195     1.008      4.8981
            7        -2.861     6.222     1.005      1.8930
            8        -2.485     3.795     0.998      0.0819
            9        -2.115     3.201     1.013      0.0724
           10        -2.247     3.380     0.998      0.0691


The results are similar to those of Example 6.2, except that the solution
has become somewhat more sensitive to the coefficients.

Example 6.4. This example indicates the ability of the algorithm to
estimate two unknown delays. The algorithm converges rapidly from a
relatively poor initial guess. The "true values" are $r_1 = 1.0$ and
$r_2 = 2.0$. The equation and the results of the parameter estimation
algorithm are given below and the convergence of the states to ten data
points on the interval [0,3] is illustrated in Figure 3.

         $\dot{x}(t) = -x(t) + x(t - r_1) - x(t - r_2), \quad t > 0,$
         $x(t) = t + 1, \quad t \le 0.$

         iterate       r_1       r_2       error
            0         0.600     4.000      7.500
            1         1.569     3.216      2.295
            2         1.146     2.100      0.100
            3         0.977     1.998      0.034
            4         0.978     2.003      0.032


Example 6.5. The equation and data for this example are the same as in
Example 6.4. In this case the initial guess reverses the order of the
"true" delay values. The results of this iteration are given below and
convergence of the states on the interval [0,3] is illustrated in Figure 4.

         iterate       r_1       r_2       error
            0         2.000     1.000      2.460
            1         0.483     1.151      1.379
            2         1.561     2.014      0.788
            3         1.100     2.072      0.077
            4         0.980     2.002      0.033



Example 6.6. In this case the algorithm is asked to estimate parameters in
a delay model of a system with no delay. Ten data points on the interval
[0,2] are computed from the exponential solution of

         $\dot{x}(t) = -2x(t),$
         $x(0) = 1,$

and the algorithm is asked to estimate unknown parameters in the system

         $\dot{x}(t) = a x(t) + b x(t - r), \quad t > 0,$
         $x(t) = t + 1, \quad t \le 0.$

The first four iterations are given below:

         iterate        a         b         r        error
            0        -3.000     3.000     2.000     1.2577
            1        -3.060    -0.637     1.947     0.2551
            2        -1.687     0.235     1.981     0.1144
            3        -1.967     0.025     1.985     0.0110
            4        -2.000     0.000     1.986     0.0001



On the fifth iteration the algorithm aborted when it was asked to invert a 
nearly singular matrix. This reflects the fact that at the true parameter 
values the state is completely insensitive to the delay. 
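
The singularity can be seen from the structure of the normal matrix D(q). In the one-delay model the sensitivity of the state with respect to r is driven by a forcing term proportional to b, so it carries an overall factor b and vanishes when b = 0. The sketch below illustrates the consequence with made-up sensitivity rows (the numbers in `base` are hypothetical and the names `normal_matrix`, `det3`, and `det_D` are invented for the sketch): scaling the delay column by b scales det D by b squared.

```python
def normal_matrix(rows):
    """D = sum_j M_j^T M_j for 1 x n sensitivity rows M_j."""
    n = len(rows[0])
    return [[sum(r[i] * r[k] for r in rows) for k in range(n)] for i in range(n)]

def det3(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

# Hypothetical sensitivity rows (d/da, d/db, d/dr), with the factor b
# removed from the last entry; values are made up for illustration.
base = [[0.5, 0.2, 0.3], [0.4, 0.5, 0.1], [0.3, 0.6, 0.4], [0.2, 0.5, 0.6]]

def det_D(b):
    """Determinant of D when the delay sensitivity column is scaled by b."""
    rows = [[m[0], m[1], b * m[2]] for m in base]
    return det3(normal_matrix(rows))
```

As the estimate of b approaches zero, det D approaches zero quadratically, which is consistent with the nearly singular matrix encountered on the fifth iteration.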

Example 6.7. This case is the same as the previous example except that the
data is taken from the closed form solution of the nonhomogeneous undelayed
equation

         $\dot{x}(t) = -2x(t) + u(t),$
         $x(0) = 1,$

where u is the same step function as in Example 6.3. The results are
similar to those of the previous example.

         iterate        a         b         r        error
            0        -3.000     3.000     2.000     1.3135
            1        -2.848     0.099     1.804     0.5121
            2        -1.841     0.138     2.401     0.0811
            3        -1.971     0.003     2.508     0.0197


Example 6.8. In this example we consider the second-order equation

         $\frac{d^2 x}{dt^2}(t) + \omega^2 x(t) + a_0 \frac{dx}{dt}(t - r) + a_1 x(t - r) = u(t), \quad t > 0,$
         $x(t) = 1, \quad t \le 0,$

where u(t) is the step function of Example 6.3. This equation models a
harmonic oscillator with retarded damping and restoring forces. In [13] a
quasilinearization algorithm is used to estimate coefficients in this
equation. The methods of this paper allow the delay r to be added to the
set of unknown parameters. For this example the averaging method was used
to compute "data" values for the parameter estimation algorithm with "true"
values of $\omega = 6$, $a_0 = 2.5$, $a_1 = 9$, and r = 1. The results of the iterative
algorithm are given below and the convergence of the states (displacement
and velocity) on the interval [0,2] is illustrated in Figures 5 and 6.


         iterate     $\omega$        a_0        a_1         r        error
            0        4.100      4.600      6.300     1.500     15.212
            1        5.073      6.025     -8.338     0.918     15.181
            2        6.705      4.710     -0.682     1.524     12.389
            3        6.188    -14.677     -4.838     1.102     31.950
            4        5.902     12.347      8.396     1.068     25.234
            5        5.964      2.994      8.980     1.061      2.186
            6        5.995      2.416      9.016     1.004      0.344
            7        6.000      2.503      8.999     1.000      0.007